NoSQL operator: filtertable

Runs standard utilities, such as grep(1) or sed(1), on a NoSQL
table passed via STDIN.

Usage: filtertable [options --] filter [args]

Options:
    --input (-i) 'file'
      Read input from 'file' instead of STDIN.

    --output (-o) 'file'
      Write output to 'file' instead of STDOUT.

    --help (-h)
      Print this help info.

    --no-header (-N)
      Suppress header from output table.

    --
      Marks the end of 'filtertable' options and the beginning
      of the actual filtering utility.

'filter': the utility to be run (grep, sed, ...).

'args': any extra arguments and options for 'filter'.

Notes:

This operator reads a NoSQL table via STDIN and runs the specified
'filter' program on the table body. Any options and arguments that
are suitable for the specified filter can be given on the command
line.
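
The idea of filtering only the table body can be sketched in plain
shell. This is not the actual 'filtertable' code: 'filter_body' is a
hypothetical helper, and it assumes a table whose header is exactly one
line, which may not match every NoSQL table format.

```shell
# Hypothetical stand-in for filtertable, for illustration only.
# Assumption: the table header is a single line.
filter_body() {
    # Read the header line and copy it to the output unchanged.
    IFS= read -r header || return 1
    printf '%s\n' "$header"
    # Run the requested filter on whatever is left on STDIN (the body).
    "$@"
}

# Sample table: TAB-separated, one header line, three records.
printf 'Key\tName\n1\tapple\n2\tbanana\n3\tcherry\n' | filter_body grep an
```

The header is passed through untouched, so a pattern that happens to
match a column name cannot drop or alter it.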

This operator can also be used as a pre-processor for other NoSQL
commands, to boost performance on large tables. For instance, on
a table with more than 20000 records I got these results:

time getrow 'Field ~ /keyword/' < bigtable.rdb
real    0m0.400s
user    0m0.350s
sys     0m0.050s

time filtertable grep keyword < bigtable.rdb | getrow 'Field ~ /keyword/'
real    0m0.079s
user    0m0.030s
sys     0m0.040s

i.e. the pre-filtered pipeline runs roughly five times faster.
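
As a quick check of the figures above, the ratio of the two "real"
times can be worked out with awk(1). The variable names are only
illustrative; the numbers are copied from the two runs shown.

```shell
# Elapsed ("real") seconds from the two timed runs above.
plain=0.400
prefiltered=0.079

# Speed-up factor: how many times faster the pre-filtered pipeline is.
speedup=$(LC_ALL=C awk -v a="$plain" -v b="$prefiltered" \
    'BEGIN { printf "%.1f", a / b }')

# Share of the elapsed time saved, as a percentage.
saved=$(LC_ALL=C awk -v a="$plain" -v b="$prefiltered" \
    'BEGIN { printf "%.0f", (a - b) / a * 100 }')

echo "speed-up: ${speedup}x, elapsed time saved: ${saved}%"
```

With these timings the script reports a speed-up of about 5.1x, which
corresponds to roughly 80% less elapsed time.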

Possible uses for this operator are limited only by your imagination
(and by the availability of suitable unix filters). For instance, it
can be used as a better/faster alternative to 'getrow' if all you need
is to pick a list of primary keys from a NoSQL table, like this:

      filtertable -- grep -f pick_list < input_table

where 'pick_list' is a file containing one key per line, each preceded
by a caret (^) and followed by a TAB, so that it can match only the
table's leftmost field, that is, the primary key.
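
A pick list in that format can be generated from a plain list of keys.
The sketch below is only one possible way to do it: 'keys.txt' and
'body' are hypothetical file names, the keys are assumed to contain no
regular-expression metacharacters, and plain grep(1) on a header-less
sample stands in for the 'filtertable' call.

```shell
workdir=$(mktemp -d)

# Hypothetical input: one bare primary key per line.
printf '10\n30\n' > "$workdir/keys.txt"

# Turn each key into the pattern "^key<TAB>".
awk '{ printf "^%s\t\n", $0 }' "$workdir/keys.txt" > "$workdir/pick_list"

# Sample table body (no header), TAB-separated, key in the first field.
printf '10\tten\n100\thundred\n30\tthirty\n' > "$workdir/body"

# Keys 10 and 30 are picked; 100 is not, because the TAB stops "^10"
# from matching a longer key.
picked=$(grep -f "$workdir/pick_list" "$workdir/body")
printf '%s\n' "$picked"

rm -r "$workdir"
```

The trailing TAB is what keeps key 10 from also selecting key 100.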

Likewise, if you want to delete a given set of keys from the table you
can simply take advantage of the grep(1) '-v' option:

      filtertable -- grep -v -f delete_list < input_table
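
The effect of '-v' can be checked on a small sample. As before, this is
a sketch with plain grep(1) on a header-less body rather than a real
'filtertable' run, and the file names are hypothetical.

```shell
workdir=$(mktemp -d)

# One pattern per key to delete, written as "^key<TAB>".
printf '^20\t\n' > "$workdir/delete_list"

# Sample table body (no header), TAB-separated, key in the first field.
printf '10\tten\n20\ttwenty\n200\ttwo hundred\n' > "$workdir/body"

# -v inverts the match: every record except key 20 is kept.
kept=$(grep -v -f "$workdir/delete_list" "$workdir/body")
printf '%s\n' "$kept"

rm -r "$workdir"
```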